Abstract: In this work, we propose a novel microaneurysm
(MA) detection method for early diabetic retinopathy screening
using color fundus images. Since MAs are usually the first lesions
to appear as an indicator of diabetic retinopathy, accurate
detection of MAs is necessary for treatment. Each pixel of the
image is classified as either MA or non-MA using a deep neural
network trained with dropout and the maxout activation function.
No preprocessing step or manual feature extraction is required.
Substantial improvements are achieved over standard MA detection
methods based on the pipeline of preprocessing, feature
extraction, classification and post-processing. The presented
method is evaluated on the publicly available Retinopathy Online
Challenge (ROC) and DIARETDB1 v2 databases and achieves
state-of-the-art accuracy.

Keywords: diabetic retinopathy, deep neural network,
microaneurysms.


I. Introduction

Diabetic retinopathy (DR) is one of the most common severe eye
diseases causing blindness in both developing and developed
countries. According to the WHO [1], DR is the primary pathology
in 4.8% of the 37 million cases of blindness around the world.
Since DR is a progressive disease, early-stage detection and
treatment can save the patient from losing sight. To monitor the
progress of the disease, the patient's fundus images need to be
checked regularly. A fast and reliable automatic computer-aided
diagnosis system will reduce the burden on specialists and
improve performance in DR mass screening. Most DR screening
systems use sensitivity and specificity as their performance
measures.

In general, MAs appear as the first lesion of diabetic
retinopathy, so reliable MA detection is of major importance for
diabetic screening. In color fundus images, MAs appear as small
red dots with a radius smaller than that of the major optic vein.
In reality, they are tiny swollen capillaries in the retina that
can leak blood, leading to other pathological symptoms such as
exudates and hemorrhages. Automatic fundus-image-based DR
screening faces various challenges, such as vessel bifurcations
and crossings, illumination and contrast changes, artifacts, and
image degradation due to the imaging device setup. A foolproof DR
screening system should be capable of detecting clinical features
such as exudates, MAs, hemorrhages, cotton wool spots and blood
vessel damage. A recent state-of-the-art method for exudate and
cotton wool spot detection was presented by Haloi et al. [3]. MA
detection is a well-investigated research area for DR mass
screening systems. Our motivation in this work is to present a
new method that detects MAs under different challenging
conditions and achieves high sensitivity and specificity. Fig. 1
depicts a typical retinal image with pathological features such
as microaneurysms and exudates, and non-pathological features
such as the optic disc and the macula. The complexity of MA
detection can be observed from Fig. 1.

Fig. 1: Typical Pathological Retinal Image

Most of the MA detection works presented so far share a common
pipeline of three to four stages: preprocessing of the image,
manual feature extraction, classification and a postprocessing
step. The use of the high-contrast green channel of the fundus
image is also very common in MA detection research. In general,
existing methods rely on morphological operations, filtering, or
supervised classification using hand-crafted features. Antal et
al. [2] developed an ensemble-based microaneurysm detection
system that took first prize in the ROC online challenge and also
achieved good results on another dataset.

They combined several preprocessing and candidate extraction
methods to develop their final model, but they did not address
the problems of degradation and illumination changes. Quellec et
al. [9] proposed a template-matching-based method for MA
detection in the wavelet domain and developed an optimally
adapted wavelet family. Their method is prone to false detections
and to rejection of true MAs, due to haemorrhages and big vessels
respectively. Niemeijer et al. [11] combined previously existing
methods for candidate extraction and used pixel-wise
classification with manually designed features. Pereira et al.
[12] exploited a multi-agent system for MA segmentation, preceded
by Gaussian and Kirsch filter based preprocessing; the final MA
candidates evolve from the interaction of the agents with the
preprocessed image. Lazar et al. [10] detected microaneurysms
using a method based on rotating cross-section profiles, which
depends on the circularity and diameter of MAs. For each profile
a peak was detected, and features such as its size, height and
shape were calculated.

In this work, we propose a deep learning based pixel-wise MA
classification method that is invariant to luminance and contrast
changes and to artifacts. No image-based preprocessing or feature
extraction stage is required. In addition, the performance of the
method is independent of vessel structures, the optic disc and
the fovea, so extraction or detection of these features is not
required. To increase the accuracy of the method, dropout
training with the maxout activation function is used. Training of
this network is time consuming, but the testing phase is very
fast and suitable for real-time applications. We have achieved
state-of-the-art performance with a very low false positive rate
on publicly available datasets.

II. Methods

Microaneurysms (MA) usually follow a Gaussian-like intensity
distribution and are isolated from neighbouring structures. To
detect these tiny structures, a pixel-based deep neural network
(DNN) [20] is developed. Pixel-based classification is useful for
this type of complex detection. Every pixel of the image is
classified as MA or non-MA. For any given pixel, the class label
is predicted from the three RGB color channel values in a square
window of size w centered on that pixel. The window around the
given pixel may contain other MAs. An overview of the method is
depicted in Fig. 2.

Fig. 2: Method Overview

A. Data Manipulation

Because MAs are local-maximum structures, the remaining pixels of
the window centered on a given pixel need to be processed
efficiently to obtain a high classification probability. To
account for this effect, the input data was modified using two
techniques for neighbouring data suppression, specifically
foveation and nonuniform sampling. The concept of foveation
originates from the uneven size and organization of
photoreceptive cells and ganglions in the human eye. Visual
acuity is maximal in the middle of the retina, termed the fovea,
and decreases towards the periphery of the retina. Foveation has
proved to be very effective in nonlocal means denoising
algorithms [21]. In foveation, the central section of the window
is kept in focus, while the peripheral pixels are defocused using
linear space-invariant Gaussian blur kernels. The standard
deviation of the Gaussian blur kernels increases with distance
from the central section.

It has been observed that increasing the input window size of a
DNN improves performance significantly, but the computational
cost increases at the same time. Nonuniform sampling was used to
selectively reduce the number of window pixels towards the
periphery. Only the central section of the input window is
sampled at full resolution, while the sampling resolution
decreases towards the periphery. In this way, a large window can
be trained with relatively few neurons.
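
The nonuniform sampling idea can be illustrated with a minimal Python sketch. The geometric growth of the sampling step, the inner radius of 8 pixels and the function names are assumptions made purely for illustration; the exact sampling pattern is not specified in the text.

```python
def nonuniform_offsets(half_width, inner_radius, growth=2):
    """Return sorted 1-D pixel offsets from the window centre.

    Offsets up to inner_radius are kept at full resolution (step 1).
    Beyond that, the step is multiplied by `growth` after every sample
    (an assumed schedule, chosen only for illustration).
    """
    offsets = list(range(0, inner_radius + 1))
    step = growth
    pos = inner_radius + step
    while pos <= half_width:
        offsets.append(pos)
        step *= growth
        pos += step
    negative = [-o for o in reversed(offsets) if o != 0]
    return negative + offsets


def sample_window(image, row, col, offsets):
    """Sample a square, nonuniformly spaced window centred on (row, col).

    `image` is a list of rows; the caller must ensure that every
    row + offset and col + offset is a valid index.
    """
    return [[image[row + dr][col + dc] for dc in offsets] for dr in offsets]


# A 129-pixel-wide window (half width 64) is reduced to 25 samples per axis.
offsets = nonuniform_offsets(64, 8)
```

With these assumed settings the network input shrinks from 129 x 129 to 25 x 25 samples per channel while the central 17 x 17 region is kept at full resolution.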

B. Network Architecture 

DNNs are hierarchical neural networks inspired by the simple and
complex cells in the human primary visual cortex. A DNN comprises
convolutional layers alternating with max-pooling layers [22],
followed by fully connected layers and a final classification
layer. The ability of DNNs to learn discriminative features from
raw image patches makes them efficient for computer vision tasks
in comparison to traditional hand-crafted features. The network
used in this work contains five layers including the
classification layer; the first three are convolutional layers,
each followed by max-pooling. The convolutional layers are
followed by one fully connected hidden layer, and the softmax
classification layer is fully connected, with two neurons for MA
and non-MA. In this work, we have incorporated the dropout [18]
training algorithm for the three convolutional layers and the one
fully connected hidden layer. The maxout activation function is
used for all layers in the network except the softmax layer.

1) Convolutional Layer: The convolutional layer [19] is the core
building block of a deep neural network. It is parameterized by
the input volume size $M_i \times M_i \times D_i$, the receptive
field or filter size $F$, the depth of the conv layer $K$ and the
stride or skipping factor $S$. If the input border is zero padded
with size $P$, then the size of the output volume
$M_o \times M_o \times D_o$ is calculated as follows.

$$M_o = \frac{M_i - F + 2P}{S} + 1, \qquad D_o = K \tag{1}$$

The stride should be chosen such that $M_o$ is an integer.

2) Max-Pooling Layer: The max-pooling layer ensures fast
convergence in comparison to traditional neural networks. In
addition, max-pooling provides translation invariance. The input
image is partitioned into a set of non-overlapping rectangles and
the maximum value of each subregion is chosen for output. If
$W(k,l)$ are the subregions, then the output is obtained as
follows.

$$y_{kl} = \max_{(i,j) \in W(k,l)} x_{ij} \tag{2}$$

Suppose the input volume to the max-pooling layer has size
$M_i \times M_i \times D_i$, with spatial extent $F$ and skipping
factor $S$; then the output volume of size
$M_o \times M_o \times D_o$ is calculated as follows.

$$M_o = \frac{M_i - F}{S} + 1, \qquad D_o = D_i \tag{3}$$

If $F > S$, the process is called overlapping pooling; in
general, a model with overlapping pooling is less prone to
overfitting.
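
As an illustration of Eq. (2) and of the difference between non-overlapping and overlapping pooling, the following is a minimal pure-Python sketch. The function name and the list-of-lists representation of a feature map are assumptions for illustration, not part of the implementation used in this work.

```python
def max_pool(feature_map, size, stride):
    """Max-pool a 2-D list with a size x size window moved by `stride`.

    stride == size gives non-overlapping pooling;
    size > stride gives overlapping pooling.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    out_rows = (rows - size) // stride + 1
    out_cols = (cols - size) // stride + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Overlapping pooling (F = 3, S = 2) on a 5 x 5 map yields a 2 x 2 output.
example = [[r * 5 + c for c in range(5)] for r in range(5)]
pooled = max_pool(example, size=3, stride=2)
```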

3) Dropout and maxout: Dropout [18] is one of the most important
improvements in machine learning and has proved successful in
many applications. It has been observed that combining the
outputs of many models improves accuracy significantly, but in
the case of deep neural networks, training many models is
computationally costly. Srivastava et al. [18] introduced dropout
training for deep neural networks as a means to reduce
overfitting by randomly omitting the output of each hidden neuron
with a probability of 0.5. Training is similar to that of a
standard neural network using stochastic gradient descent. The
only difference is that dropped-out neurons do not take part in
the forward pass and backpropagation. Suppose a neural network
model has $L$ hidden layers, and $W$ and $b$ are the weight and
bias matrices of the network. Let $l \in \{1, 2, \ldots, L\}$ be
the hidden layer index, and let $z^{(l)}$ and $y^{(l)}$ denote
the vectors of inputs and outputs, respectively, at layer $l$.
The following equation describes the feed-forward operation.

$$z_i^{(l+1)} = w_i^{(l+1)} y^{(l)} + b_i^{(l+1)}, \qquad y_i^{(l+1)} = f\left(z_i^{(l+1)}\right) \tag{4}$$

Using dropout training, the feed-forward equation becomes

$$\begin{aligned} r_j^{(l)} &\sim \mathrm{Bernoulli}(p) \\ \tilde{y}^{(l)} &= r^{(l)} * y^{(l)} \\ z_i^{(l+1)} &= w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)} \\ y_i^{(l+1)} &= f\left(z_i^{(l+1)}\right) \end{aligned} \tag{5}$$

where $*$ denotes the element-wise product and $f$ is any
activation function; in our case $f$ is the maxout activation
function.

A typical dropout training situation is illustrated in Fig. 3;
the black circles denote nodes dropped out of the network.
Dropped-out nodes do not participate in the forward pass or
backpropagation of that training iteration.

Fig. 3: Dropout training illustration

The conventional way to represent a neuron's output $f$ as a
function of its input $x$ is $f(x) = (1 + e^{-x})^{-1}$ or
$f(x) = \tanh(x)$. Problems arise with this type of function in
gradient descent training, as these functions saturate for large
positive and negative values of $x$. Gradient descent gets stuck
with this type of function, but a considerable improvement can be
achieved by slightly modifying the activation function, as
proposed by Goodfellow et al. [17] with the maxout network.
Maxout is a new kind of activation function for deep neural
networks trained with dropout. In the maxout algorithm, the
inputs to the activation function are divided into groups of $k$
units and the maximum response is recorded. Fig. 4 depicts a
typical maxout activation function. Given an input
$x \in \mathbb{R}^{d}$, a maxout hidden unit $h_i$ implements the
following function

$$h_i(x) = \max_{j \in [1,k]} z_{ij}, \qquad z_{ij} = x^{T} W_{\cdot ij} + b_{ij} \tag{6}$$

where $W \in \mathbb{R}^{d \times m \times k}$ and
$b \in \mathbb{R}^{m \times k}$.

Fig. 4: Maxout activation function illustration
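
The interaction of dropout and maxout in a single fully connected layer can be sketched in a few lines of pure Python. This is an illustrative sketch only: the function names, the nested-list weight layout `W[d][i][j]` and the reading of `p` as the probability of keeping a unit (following the r ~ Bernoulli(p) convention) are assumptions, not the Pylearn2 implementation used in this work.

```python
import random


def dropout_mask(n, p, rng):
    """Draw r_j ~ Bernoulli(p): 1 keeps a unit, 0 drops it (assumed convention)."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def maxout_layer(x, W, b):
    """Compute h_i(x) = max_j (sum_d x[d] * W[d][i][j] + b[i][j]).

    W is indexed as W[d][i][j]: d over inputs, i over the m hidden
    units, j over the k linear pieces of each unit.
    """
    m, k = len(b), len(b[0])
    hidden = []
    for i in range(m):
        z = [
            sum(x[d] * W[d][i][j] for d in range(len(x))) + b[i][j]
            for j in range(k)
        ]
        hidden.append(max(z))
    return hidden


def dropout_maxout_forward(y, W, b, p, rng):
    """Thin the layer input with a dropout mask, then apply maxout."""
    r = dropout_mask(len(y), p, rng)
    y_tilde = [ri * yi for ri, yi in zip(r, y)]
    return maxout_layer(y_tilde, W, b)


# Toy layer: d = 2 inputs, m = 1 maxout unit with k = 2 linear pieces.
x = [1.0, 2.0]
W = [[[1.0, -1.0]], [[0.5, 2.0]]]
b = [[0.0, 0.0]]
h = maxout_layer(x, W, b)  # pieces evaluate to 2.0 and 3.0
```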

C. Learning nets 

Using the pixel-level expert-annotated ground truth, each pixel
is labelled as either MA or non-MA. The training set consists of
windows centered on image pixels. If a window lies partly outside
the image border, the missing pixels are derived by horizontal
reflection. Windows with an MA pixel at the center are used as MA
samples, and windows with a non-MA pixel at the center as non-MA
samples for training. Moreover, to reduce overfitting and to
ensure rotational invariance, the dataset is enlarged in the most
common way, using random rotations and horizontal reflections for
border pixels.
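
Window extraction with reflected borders can be sketched as follows. The helper names are hypothetical, and mirroring along both axes without repeating the edge pixel is an assumption; the text only states that missing pixels are obtained by reflection.

```python
def reflect_index(i, n):
    """Map any integer index onto [0, n) by mirroring about the borders.

    The edge pixel itself is not repeated: -1 maps to 1, n maps to n - 2.
    """
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i = abs(i) % period
    return period - i if i >= n else i


def extract_window(image, row, col, half_width):
    """Return the (2 * half_width + 1)-sized square window centred on (row, col).

    Pixels outside the image are filled by reflection, so a window can be
    extracted for every pixel, including those on the border.
    """
    rows, cols = len(image), len(image[0])
    span = range(-half_width, half_width + 1)
    return [
        [image[reflect_index(row + dr, rows)][reflect_index(col + dc, cols)]
         for dc in span]
        for dr in span
    ]


# Corner pixel of a 3 x 3 toy image: the missing neighbours are mirrored.
toy = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
corner = extract_window(toy, 0, 0, 1)
```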

The training procedure for dropout neural networks using the
maxout activation function resembles that of traditional neural
networks, with a few exceptions. In dropout network learning,
each neuron is dropped according to Bernoulli(p), resulting in a
thinned network, and the forward pass and backpropagation are
performed only on this thinned network. The convergence of
stochastic gradient descent improves considerably in this network
with the maxout activation function. Also, one particular form of
regularization, namely constraining the norm of the incoming
weight vector at each hidden unit, has been found to be
especially useful for dropout training. This is termed max-norm
regularization, inspired by its previous use in the context of
collaborative filtering [23].

$$\Delta w^{t} = m^{t}\,\Delta w^{t-1} - (1 - m^{t})\,\epsilon^{t}\,\langle \nabla_{w} \ell \rangle, \qquad w^{t} = w^{t-1} + \Delta w^{t} \tag{7}$$

The weight norm constraint $\|w\|_2 \le c$ is applied only to the
fully connected layers.

$$\epsilon^{t} = \epsilon_{0} f^{t}, \qquad m^{t} = \begin{cases} \frac{t}{T} m_{i} + \left(1 - \frac{t}{T}\right) m_{f}, & t < T \\ m_{f}, & t \ge T \end{cases} \tag{8}$$

where $c$ is a fixed constant, $t$ is the iteration index,
$\epsilon$ is the learning rate and $m$ is the momentum variable.
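
A single training step with momentum and max-norm regularization might be sketched as below. This is a hedged illustration rather than the training code used here: it assumes the momentum update takes the form delta = m * delta_prev - (1 - m) * eps * grad, that the learning rate decays geometrically as eps0 * f**t, and that max-norm regularization is enforced by rescaling a weight vector whose L2 norm exceeds c. All function names are hypothetical.

```python
import math


def learning_rate(t, eps0, f):
    """Geometrically decayed learning rate eps0 * f**t (assumed schedule)."""
    return eps0 * f ** t


def sgd_momentum_step(w, delta_prev, grad, eps, m):
    """One momentum update; returns the new weights and the new increment."""
    delta = [m * dp - (1 - m) * eps * g for dp, g in zip(delta_prev, grad)]
    w_new = [wi + di for wi, di in zip(w, delta)]
    return w_new, delta


def max_norm_project(w, c):
    """Rescale w so that its L2 norm does not exceed the constant c."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm <= c:
        return list(w)
    scale = c / norm
    return [wi * scale for wi in w]


# One step on a toy weight vector, followed by the max-norm projection.
w, delta = sgd_momentum_step([3.0, 4.0], [0.0, 0.0], [0.0, 0.0], eps=0.1, m=0.5)
w = max_norm_project(w, c=1.0)  # norm 5 is rescaled to norm 1
```

In this sketch the projection is applied after every update, so the constraint holds throughout training regardless of how large the learning rate is.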

III. Experiments, Results and Discussions 

This method has been tested on the publicly available ROC,
Messidor [14] and DIARETDB1 v2 [15] datasets. These datasets are
well annotated, and the pixel-wise labelling facilitates the
design of our pixel-wise classification model. ROC contains 50
training images of 768×576 pixels, Messidor consists of 1200
losslessly compressed images with a 45-degree field of view, and
DIARETDB1 v2 includes 89 images of 1500×1152 pixels. The Messidor
images were captured using 8 bits per color plane at 1440×960,
2240×1488 or 2304×1536 pixels. Each image is provided with a
grading score from R0 to R3. R0 and R1 correspond to no DR and
mild DR respectively, whereas R2 and R3 are severe DR and
proliferative DR respectively. The grading is based on the number
of MAs and haemorrhages and the presence or absence of
neovascularization. No grading scheme is available for the ROC
and DIARETDB1 v2 datasets. Not all of the images have MAs as
pathological features. Pixels with other labels, such as
haemorrhages, blood vessel crossings (between two different
vessels), bifurcations (one vessel originating from another) and
end points of disconnected vessels, are considered non-MA
samples. For each pixel, the network receives six different input
windows obtained by data augmentation: three windows are obtained
using vertical and horizontal mirroring, and each of these is
then modified using foveation and nonuniform sampling, producing
two final windows. This setting emphasizes the central pixel and
makes efficient use of a bigger window size. The images were
taken under different conditions with different cameras, at their
native resolution and compression settings. The retinal
specialist annotations were obtained from a combination of three
ophthalmologists with retinal fellowship training. All
experiments were conducted on an Ubuntu machine with 12GB RAM, an
Intel i7 3.10GHz processor and an NVIDIA GTX 590 graphics card
with 1024 CUDA cores. We use the Pylearn2 machine learning
library, built on top of Theano. Pylearn2 comes with an efficient
implementation of dropout training with the maxout activation
function. In total, 90000 MA and 1.5 million non-MA windows were
used to train the network. While constructing the non-MA windows,
emphasis was placed on including an extensive number of possible
false positives, and only a small number of trivial non-MA
windows were included. This setting helps the network to learn
properly distinctive features.

For accuracy analysis of MA detection, we compute the true
positives (TP), the number of MA pixels correctly detected; the
false positives (FP), the number of non-MA pixels wrongly
detected as MA pixels; the false negatives (FN), the number of MA
pixels that were not detected; and the true negatives (TN), the
number of non-MA pixels correctly identified as non-MA pixels.
For a better representation of accuracy, sensitivity and
specificity at pixel level were used as our measures. The global
sensitivity SE, the global specificity SP, the accuracy AC and
the positive predictive value PRED for each image are defined as
follows.

$$SE = \frac{TP}{TP + FN}, \quad SP = \frac{TN}{TN + FP}, \quad AC = \frac{TP + TN}{TP + TN + FP + FN}, \quad PRED = \frac{TP}{TP + FP} \tag{9}$$
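
The measures in Eq. (9) translate directly into code. The sketch below uses hypothetical pixel counts chosen only to illustrate the arithmetic; they are not results from this work.

```python
def pixel_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, accuracy and positive predictive value."""
    return {
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
        "AC": (tp + tn) / (tp + tn + fp + fn),
        "PRED": tp / (tp + fp),
    }


# Hypothetical counts for one image (illustrative values only).
metrics = pixel_metrics(tp=90, fp=10, fn=10, tn=890)
```

Because non-MA pixels vastly outnumber MA pixels in a fundus image, accuracy and specificity stay high even for a weak detector, which is why sensitivity is reported alongside them.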


The best network architecture is given in Table I, with three
convolutional layers, each followed by a max-pooling layer, and
one fully connected layer. A softmax layer on top of the network
has two neurons for the MA and non-MA probability values. The
probability of dropping a neuron in each layer is shown in the
Bern(p) column. Each convolutional layer has filters of size 5×5,
with a stride of 2 pixels for the first layer and 1 pixel for the
next two layers. Overlapping pooling is used in each of the
max-pooling layers, with a stride of 2 pixels and pooling regions
of size 3×3.


TABLE I: Network Architecture

Layer  Type   Maps & Neurons   Size    Stride  Bern(p)
0      input  3 x 129 x 129    -       -       0.1
1      Conv   64 x 63 x 63     5 x 5   2       0.2
2      MP     64 x 31 x 31     3 x 3   2       -
3      Conv   64 x 27 x 27     5 x 5   1       0.2
4      MP     64 x 13 x 13     3 x 3   2       -
5      Conv   64 x 9 x 9       5 x 5   1       0.5
6      MP     64 x 4 x 4       3 x 3   2       -
7      FC     290              1 x 1   -       0.5
8      FC     2                1 x 1   -       -
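
The spatial sizes listed in Table I can be checked against Eqs. (1) and (3). The short sketch below assumes no zero padding (P = 0), which is inferred from the listed sizes rather than stated in the text; the function and variable names are illustrative.

```python
def conv_out(m_in, f, stride, pad=0):
    """Output width of a convolutional layer, Eq. (1)."""
    size, remainder = divmod(m_in - f + 2 * pad, stride)
    if remainder:
        raise ValueError("stride does not give an integer output size")
    return size + 1


def pool_out(m_in, f, stride):
    """Output width of a max-pooling layer, Eq. (3)."""
    return (m_in - f) // stride + 1


# (kind, filter size F, stride S) for layers 1 to 6 of Table I.
LAYERS = [
    ("conv", 5, 2), ("pool", 3, 2),
    ("conv", 5, 1), ("pool", 3, 2),
    ("conv", 5, 1), ("pool", 3, 2),
]


def trace_sizes(m_in, layers):
    """Propagate the input width through the layers and collect each output width."""
    sizes = []
    for kind, f, stride in layers:
        m_in = conv_out(m_in, f, stride) if kind == "conv" else pool_out(m_in, f, stride)
        sizes.append(m_in)
    return sizes


sizes = trace_sizes(129, LAYERS)  # widths of layers 1 to 6
```

Starting from the 129-pixel input window, the sketch reproduces the widths 63, 31, 27, 13, 9 and 4 given in the table.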




To detect MAs in an unseen image, we first apply a mask to keep
all pixels of interest, removing the usual black region that
appears in fundus photography. A color threshold is also defined
to leave out trivial non-MA pixels and reduce computation time. A
window of size 129×129 centered at each image pixel is extracted;
for pixels near the boundaries, windows are extracted using
horizontal mirroring. Each window has R, G and B color channels.
The detector assigns to each pixel in the image a probability of
being MA and of being non-MA. Finally, an MA probability map is
generated for the test image.

Fig. 5: Detected MA pixels at the center of the windows

To remove possible false detections, each connected region of the
probability map is processed using the convexity and the area of
the region. Let $M$ be the set of all MA pixels; then

• $N$: number of connected regions in the probability map.

• For each region $P_R$, update $M = M \cup P_R$ if
$\mathrm{Area}_{P_R} < 21$ and $\mathrm{Convexity}_{P_R} > 0.8$.

• $\mathrm{Area}_{P_R}$ is the area of the region $P_R$ and
$\mathrm{Convexity}_{P_R}$ is its convexity.

This ensures that no vessel crossings, bifurcations or
haemorrhages are included in the MA detection.
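
The region filter above can be sketched in pure Python. Several details here are assumptions, since the text does not define them: regions are taken to be 8-connected, the area is the pixel count, and convexity is computed as the pixel count divided by the area of the convex hull of the pixel corners (often called solidity). Function names are hypothetical.

```python
from collections import deque


def connected_regions(mask):
    """Return a list of 8-connected regions, each a list of (row, col) pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(region)
    return regions


def hull_area(points):
    """Area of the convex hull (monotone chain followed by the shoelace formula)."""
    pts = sorted(set(points))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    hull = half_hull(pts) + half_hull(reversed(pts))
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % len(hull)][1]
        - hull[(i + 1) % len(hull)][0] * hull[i][1]
        for i in range(len(hull))
    )
    return abs(twice_area) / 2


def convexity(region):
    """Pixel count divided by the hull area of the pixel corners (assumed definition)."""
    corners = [(y + dy, x + dx) for y, x in region for dy in (0, 1) for dx in (0, 1)]
    return len(region) / hull_area(corners)


def filter_regions(mask, max_area=21, min_convexity=0.8):
    """Keep regions with area < max_area and convexity > min_convexity."""
    return [
        region for region in connected_regions(mask)
        if len(region) < max_area and convexity(region) > min_convexity
    ]
```

Under this definition a compact blob has convexity close to 1, while a thin cross-shaped region such as a vessel crossing scores much lower and is rejected.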

For a typical image, the pixel-based detection output is shown in
Fig. 5. We have observed that this method can reliably detect MA
candidates.

A comparison of this method with existing DR screening systems is
shown in Table II. The comparison is not made on equal grounds,
since the datasets and the proportion of images with DR symptoms
differ, but the sensitivity (Sens), specificity (Spec) and area
under the curve (AUC) values can be accepted for mutual
comparison. Our method achieves higher sensitivity and
specificity than the listed methods, with an AUC (0.988) close to
the best reported value (0.989).


TABLE II: Comparison of automatic DR screening systems.

Method               DR(%)   Sens   Spec   AUC
Proposed Method      46      97%    95%    0.988
Antal et al. [2]     46      90%    91%    0.989
Agurto et al. [4]    76.26   NA     NA     0.89
Abramoff et al. [5]  4.8     84%    64%    0.84
Jelinek et al. [6]   30      85%    90%    NA


This method achieves a lower false positive rate than other
existing systems. A comparison of sensitivity against the average
number of false positive pixels per image is shown in Fig. 6, and
the variation of sensitivity with 1-specificity is shown in
Fig. 7. For comparison, the result of one existing method is also
plotted in this figure.

Fig. 6: Comparison of Sensitivity vs Average Number of False
Positive Pixels per Image (x-axis: Avg False Positive/image)

Fig. 7: Comparison of Sensitivity vs 1-Specificity

Table III compares this method with a recent state-of-the-art
system on the Messidor dataset for the scenario R0 vs R1. It can
be observed that our method achieves an accuracy of 95% with
sensitivity and specificity of 97% and 94% respectively.


TABLE III: Comparison of results on the Messidor dataset for the
scenario R0 vs R1

Method             Sensitivity   Specificity   Acc     AUC
Proposed Method    97%           95%           95.4%   0.982
Antal et al. [2]   94%           90%           90%     0.942


For the scenario No DR vs DR, Table IV compares our results with
the existing state-of-the-art method on the same dataset.


TABLE IV: Comparison of results on the Messidor dataset for the
scenario No DR/DR

Method             Sensitivity   Specificity   Acc    AUC
Proposed Method    97%           96%           96%    0.988
Antal et al. [2]   90%           91%           90%    0.989


An extensive evaluation was also carried out on the ROC dataset.
Since labels for the test data are not available, we used a part
of the training data (not used in training this method) to
evaluate the accuracy of our method. Table V compares the AUC
values with other methods on the same dataset.

TABLE V: Comparison of results on the ROC dataset (our result is
on a subset of the training data only, since test data labels are
not available)

Method            AUC
Proposed Method   0.98
Human Expert      0.96
OK Medical [7]    0.89
Fujita Lab [8]    0.88
LaTIM [9]         0.87


IV. Conclusion 

In this work, we have presented a deep learning based
computer-aided system for microaneurysm detection. The deep
network consists of 5 layers including the softmax output layer,
and dropout training with the maxout activation function is used
to improve accuracy. In contrast to other existing methods, this
system does not require an additional blood vessel extraction
step, preprocessing or feature design. The method has been tested
on publicly available datasets and achieves state-of-the-art
performance for MA candidate extraction with a low false positive
rate; hence it is useful for diabetic mass screening.

Acknowledgment 

References

[1] WHO, DR report. http://www.who.int/blindness/causes/priority/en/index5.html

[2] Antal, Bálint, and András Hajdu. "An ensemble-based system for
automatic screening of diabetic retinopathy." Knowledge-Based Systems 60
(2014): 20-27.

[3] Haloi, Mrinal, Samarendra Dandapat, and Rohit Sinha. "A Gaussian
scale space approach for exudates detection, classification and severity
prediction." arXiv preprint arXiv:1505.00737 (2015).

[4] C. Agurto, E.S. Barriga, V. Murray, S. Nemeth, R. Crammer, W. Bauman,
G. Zamora, M.S. Pattichis, P. Soliz, Automatic detection of diabetic
retinopathy and age-related macular degeneration in digital fundus images.
Invest. Ophthalmol. Vis. Sci. 52 (8) (2011) 5862-5871.

[5] M. Abramoff, J. Reinhardt, S. Russell, J. Polk, V. Mahajan, M. Niemeijer,
G. Quellec, Automated early detection of diabetic retinopathy.
Ophthalmology 117 (6) (2010) 1147-1154.

[6] H.J. Jelinek, M.J. Cree, D. Worsley, A. Luckie, P. Nixon, An automated
microaneurysm detector as a tool for identification of diabetic retinopathy
in rural optometric practice. Clinical Exp. Optometry 89 (5) (2006)
299-305.

[7] B. Zhang, X. Wu, J. You, Q. Li, and F. Karray, Hierarchical detection of
red lesions in retinal images by multiscale correlation filtering, in SPIE
Medical Imaging 2009: Computer-Aided Diagnosis, N. Karssemeijer and
M. L. Giger, Eds., vol. 7260. SPIE, 2009, p. 72601L.

[8] A. Mizutani, C. Muramatsu, Y. Hatanaka, S. Suemori, T. Hara, and H.
Fujita, Automated microaneurysm detection method based on double ring
filter in retinal fundus images, in SPIE Medical Imaging 2009:
Computer-Aided Diagnosis, N. Karssemeijer and M. L. Giger, Eds., vol. 7260,
no. 1. SPIE, 2009, p. 72601N.

[9] G. Quellec, M. Lamard, P. M. Josselin, G. Cazuguel, B. Cochener, and
C. Roux, Optimal wavelet transform for the detection of microaneurysms
in retina photographs, IEEE Transactions on Medical Imaging, vol. 27,
no. 9, pp. 1230-1241, 2008.

[10] Lazar, Istvan, and Andras Hajdu. "Microaneurysm detection in retinal
images using a rotating cross-section based model." Biomedical Imaging:
From Nano to Macro, 2011 IEEE International Symposium on. IEEE,
2011.


[11] Niemeijer, Meindert, et al. "Automatic detection of red lesions in
digital color fundus photographs." Medical Imaging, IEEE Transactions on
24.5 (2005): 584-592.

[12] Pereira, Carla, et al. "Using a multi-agent system approach for
microaneurysm detection in fundus images." Artificial intelligence in
medicine 60.3 (2014): 179-188.

[13] L. Giancardo, F. Meriaudeau, T. Karnowski, Y. Li, K. Tobin, and E.
Chaum, Microaneurysm detection with radon transform-based classification
on retina images, in Engineering in Medicine and Biology Society,
EMBC, 2011 Annual International Conference of the IEEE. IEEE, 2011,
pp. 5939-5942.

[14] Methods to evaluate segmentation and indexing techniques in the field
of retinal ophthalmology. Available: http://messidor.crihan.fr

[15] Kauppi, T., Kalesnykiene, V., Kamarainen, J.-K., Lensu, L., Sorri,
I., Raninen A., Voutilainen R., Uusitalo, H., Kälviäinen, H., Pietilä, J.,
DIARETDB1 diabetic retinopathy database and evaluation protocol. In
Proc. of the 11th Conf. on Medical Image Understanding and Analysis
(Aberystwyth, Wales, 2007).

[16] Ian J. Goodfellow, David Warde-Farley, Pascal Lamblin, Vincent
Dumoulin, Mehdi Mirza, Razvan Pascanu, James Bergstra, Frédéric Bastien,
and Yoshua Bengio. "Pylearn2: a machine learning research library."
arXiv preprint arXiv:1308.4214.

[17] Goodfellow, Ian J., et al. "Maxout networks." arXiv preprint
arXiv:1302.4389 (2013).

[18] Srivastava, Nitish, et al. "Dropout: A simple way to prevent neural
networks from overfitting." The Journal of Machine Learning Research
15.1 (2014): 1929-1958.

[19] LeCun, Yann, et al. "Gradient-based learning applied to document
recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.

[20] Dan Claudiu Ciresan, Ueli Meier, and Jürgen Schmidhuber.
Multi-column deep neural networks for image classification. In Computer
Vision and Pattern Recognition, pages 3642-3649, 2012.

[21] A. Foi and G. Boracchi, "Anisotropically Foveated Nonlocal Image
Denoising", Proc. IEEE Int. Conf. Image Process. (ICIP 2013), pp. 464-468,
Melbourne, Australia, Sep. 2013.

[22] Scherer, Dominik, Andreas Müller, and Sven Behnke. "Evaluation of
pooling operations in convolutional architectures for object recognition."
Artificial Neural Networks-ICANN 2010. Springer Berlin Heidelberg,
2010. 92-101.

[23] Nathan Srebro and Adi Shraibman. Rank, trace-norm and max-norm. In
Proceedings of the 18th annual conference on Learning Theory, COLT'05,
pages 545-560, Berlin, Heidelberg, 2005. Springer-Verlag.









